Project Motivation: Investigate the impact of pandemic restrictions on hockey player development.
In 2020-2021, many hockey leagues played a shortened season or no season at all because of COVID-19. When a league such as the OHL was shut down, players had to find other leagues or tournaments to play in, and some were unable to practice with a team at all that season. How did this disruption in training influence a player's development? To answer this question, we examine data from the 2019-2020, 2020-2021, and 2021-2022 seasons for junior leagues.
| team_name | season | league | gp | g | a | pts | pm |
|---|---|---|---|---|---|---|---|
| Hamilton Bulldogs | 2019-2020 | OHL | 59 | 3 | 14 | 17 | 2 |
| Windsor Spitfires | 2021-2022 | OHL | 67 | 11 | 18 | 29 | 28 |
| Barrie Colts | 2021-2022 | OHL | 39 | 1 | 5 | 6 | -3 |
| Mississauga Steelheads | 2021-2022 | OHL | 57 | 8 | 35 | 43 | 12 |
| Soo Greyhounds | 2021-2022 | OHL | 43 | 1 | 4 | 5 | -2 |
## [1] "proportion who played is 0.178"
#### Continuous Age vs PPG
##
## Call:
## lm(formula = ppg_total ~ position * plyr_quality + treatment *
## drafted + gp_total + plyr_quality * gp_total + plyr_quality *
## drafted + age_continuous * plyr_quality + season, data = ohl_filtered)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.56219 -0.10026 -0.01353 0.06968 0.83082
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.6487089 0.2993550 2.167 0.03079 *
## positionF 0.1050757 0.0329205 3.192 0.00152 **
## plyr_quality 0.2821307 0.5193237 0.543 0.58723
## treatmentPlayed 0.0019808 0.0336923 0.059 0.95315
## draftedTRUE 0.0859495 0.0444539 1.933 0.05384 .
## gp_total 0.0031135 0.0007901 3.940 9.51e-05 ***
## age_continuous -0.0482747 0.0174768 -2.762 0.00599 **
## season2021-2022 0.3427863 0.0337114 10.168 < 2e-16 ***
## positionF:plyr_quality -0.0270689 0.0968855 -0.279 0.78008
## treatmentPlayed:draftedTRUE 0.0327579 0.0589740 0.555 0.57887
## plyr_quality:gp_total 0.0008258 0.0020820 0.397 0.69185
## plyr_quality:draftedTRUE -0.0525461 0.0888778 -0.591 0.55469
## plyr_quality:age_continuous 0.0325024 0.0285984 1.137 0.25638
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1935 on 425 degrees of freedom
## Multiple R-squared: 0.7396, Adjusted R-squared: 0.7322
## F-statistic: 100.6 on 12 and 425 DF, p-value: < 2.2e-16
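
Because the formula contains treatment * drafted, the estimated effect of having played in 2020-2021 depends on draft status. The following is only a worked reading of the coefficients printed above (rounded to five decimals), not an additional analysis:

$$
\begin{aligned}
\text{undrafted:}\quad & \hat\beta_{\text{treatmentPlayed}} = 0.00198 \\
\text{drafted:}\quad & \hat\beta_{\text{treatmentPlayed}} + \hat\beta_{\text{treatmentPlayed:draftedTRUE}} = 0.00198 + 0.03276 = 0.03474
\end{aligned}
$$

Each of the two terms has a large p-value on its own (0.953 and 0.579), so this fit gives no clear evidence of a treatment effect on PPG for either group.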
##
## Call:
## lm(formula = log(1 + ppg_total) ~ position * plyr_quality + treatment *
## drafted + gp_total + plyr_quality * gp_total + plyr_quality *
## drafted + age_continuous * plyr_quality + season, data = ohl_filtered)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.32901 -0.05826 -0.00607 0.04456 0.39200
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.183278 0.177310 1.034 0.30188
## positionF 0.084160 0.019499 4.316 1.98e-05 ***
## plyr_quality 0.865576 0.307600 2.814 0.00512 **
## treatmentPlayed 0.002898 0.019956 0.145 0.88459
## draftedTRUE 0.071914 0.026330 2.731 0.00657 **
## gp_total 0.002811 0.000468 6.007 4.06e-09 ***
## age_continuous -0.017665 0.010352 -1.707 0.08864 .
## season2021-2022 0.208512 0.019967 10.443 < 2e-16 ***
## positionF:plyr_quality -0.077930 0.057386 -1.358 0.17518
## treatmentPlayed:draftedTRUE 0.020942 0.034931 0.600 0.54915
## plyr_quality:gp_total -0.001254 0.001233 -1.017 0.30985
## plyr_quality:draftedTRUE -0.096164 0.052643 -1.827 0.06844 .
## plyr_quality:age_continuous -0.007997 0.016939 -0.472 0.63712
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1146 on 425 degrees of freedom
## Multiple R-squared: 0.7686, Adjusted R-squared: 0.7621
## F-statistic: 117.7 on 12 and 425 DF, p-value: < 2.2e-16
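
Since the response here is log(1 + ppg_total), the coefficients act multiplicatively on 1 + PPG, and fitted values must be back-transformed before they can be compared with raw PPG. As a worked reading of the output above, writing $\hat\eta$ for the fitted linear predictor (notation introduced here for illustration):

$$
\log\bigl(1 + \widehat{\text{ppg}}\bigr) = \hat\eta
\quad\Longrightarrow\quad
\widehat{\text{ppg}} = e^{\hat\eta} - 1,
\qquad
e^{\hat\beta_{\text{season2021-2022}}} = e^{0.2085} \approx 1.232
$$

So, holding the other terms fixed, 1 + PPG is estimated to be about 23% higher in 2021-2022 than in the reference season.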
Takeaways:
##
## Call:
## lm(formula = sqrt(1 + ppg_total) ~ position * plyr_quality +
## treatment * drafted + gp_total + plyr_quality * gp_total +
## plyr_quality * drafted + age_continuous * plyr_quality +
## season, data = ohl_filtered)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.209227 -0.037682 -0.005305 0.026890 0.273866
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.1844952 0.1139609 10.394 < 2e-16 ***
## positionF 0.0472054 0.0125325 3.767 0.000189 ***
## plyr_quality 0.3270471 0.1977004 1.654 0.098814 .
## treatmentPlayed 0.0012675 0.0128263 0.099 0.921325
## draftedTRUE 0.0395618 0.0169231 2.338 0.019864 *
## gp_total 0.0014869 0.0003008 4.943 1.11e-06 ***
## age_continuous -0.0150411 0.0066532 -2.261 0.024281 *
## season2021-2022 0.1329864 0.0128335 10.362 < 2e-16 ***
## positionF:plyr_quality -0.0298963 0.0368832 -0.811 0.418067
## treatmentPlayed:draftedTRUE 0.0132330 0.0224507 0.589 0.555890
## plyr_quality:gp_total -0.0002172 0.0007926 -0.274 0.784167
## plyr_quality:draftedTRUE -0.0408806 0.0338347 -1.208 0.227625
## plyr_quality:age_continuous 0.0038789 0.0108871 0.356 0.721805
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.07366 on 425 degrees of freedom
## Multiple R-squared: 0.756, Adjusted R-squared: 0.7491
## F-statistic: 109.7 on 12 and 425 DF, p-value: < 2.2e-16
Takeaways:
##
## Call:
## glm(formula = ppg_alt ~ position + plyr_quality + treatment +
## drafted + gp_total + age_continuous + season, family = Gamma,
## data = ohl_filtered)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.04781 -0.45634 -0.05151 0.29262 1.44723
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.74752 1.10045 6.132 1.97e-09 ***
## positionF -0.64466 0.15355 -4.198 3.27e-05 ***
## plyr_quality -1.60164 0.16448 -9.737 < 2e-16 ***
## treatmentPlayed 0.08761 0.11320 0.774 0.4394
## draftedTRUE -0.27769 0.11582 -2.398 0.0169 *
## gp_total -0.03088 0.00474 -6.515 2.04e-10 ***
## age_continuous -0.06439 0.06242 -1.032 0.3029
## season2021-2022 -0.71982 0.16649 -4.324 1.91e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Gamma family taken to be 0.3145594)
##
## Null deviance: 428.22 on 437 degrees of freedom
## Residual deviance: 278.76 on 430 degrees of freedom
## AIC: 70.451
##
## Number of Fisher Scoring iterations: 6
Takeaways:
This model does not adequately capture the relationship between our predictors and response.
Note: We tried to fit a Gamma GLM with interaction terms using the default link function (inverse), but glm() failed with the error “no valid set of coefficients has been found: please supply starting values”.
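
One common way around the "please supply starting values" error is to pass explicit starting coefficients to glm() through its start argument. The sketch below uses synthetic data, so x1, x2, y, and fit_inv are hypothetical names and not the OHL variables, and the synthetic data need not reproduce the original error. It only illustrates the mechanics, and it is not guaranteed to fix the fit on ohl_filtered.

```r
# Synthetic stand-in data (hypothetical; not the OHL data set).
set.seed(42)
n  <- 300
x1 <- runif(n)
x2 <- runif(n)
mu <- exp(0.2 + 0.8 * x1 + 0.5 * x1 * x2)   # true mean, strictly positive
y  <- rgamma(n, shape = 5, rate = 5 / mu)    # Gamma response with mean mu

# Design matrix, used here only to count the coefficients.
X <- model.matrix(~ x1 * x2)

# With the inverse link, eta = 1 / mu. Starting from intercept = 1 / mean(y)
# and zero slopes gives every observation the valid initial mean mean(y) > 0.
start_vals <- c(1 / mean(y), rep(0, ncol(X) - 1))

fit_inv <- glm(y ~ x1 * x2,
               family = Gamma(link = "inverse"),
               start  = start_vals)

fit_inv$converged
```

The inverse link does not constrain the fitted mean to be positive, which is why bad starting points can fail. Switching to a log link avoids that constraint problem.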
##
## Call:
## glm(formula = ppg_alt ~ position * plyr_quality + treatment *
## drafted + gp_total + plyr_quality * gp_total + plyr_quality *
## drafted + age_continuous * plyr_quality + season, family = Gamma(link = log),
## data = ohl_filtered)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.99302 -0.26334 -0.02853 0.16594 2.28235
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.116996 0.788636 -7.756 6.54e-14 ***
## positionF 0.401587 0.086727 4.630 4.86e-06 ***
## plyr_quality 13.999425 1.368133 10.233 < 2e-16 ***
## treatmentPlayed 0.095492 0.088761 1.076 0.28261
## draftedTRUE 0.618591 0.117112 5.282 2.04e-07 ***
## gp_total 0.029064 0.002082 13.962 < 2e-16 ***
## age_continuous 0.120041 0.046042 2.607 0.00945 **
## season2021-2022 0.797146 0.088811 8.976 < 2e-16 ***
## positionF:plyr_quality -0.655966 0.255240 -2.570 0.01051 *
## treatmentPlayed:draftedTRUE 0.098357 0.155364 0.633 0.52703
## plyr_quality:gp_total -0.049748 0.005485 -9.070 < 2e-16 ***
## plyr_quality:draftedTRUE -1.295495 0.234144 -5.533 5.52e-08 ***
## plyr_quality:age_continuous -0.459833 0.075341 -6.103 2.34e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Gamma family taken to be 0.2598456)
##
## Null deviance: 428.22 on 437 degrees of freedom
## Residual deviance: 180.03 on 425 degrees of freedom
## AIC: -127.11
##
## Number of Fisher Scoring iterations: 13
Takeaways:
This model does not adequately capture the relationship between our predictors and response (relationship appears non-linear).
Predictions are not on the correct scale either.
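
One possible cause of predictions on the wrong scale (an assumption, since the prediction code is not shown here) is that predict() on a glm object returns values on the link scale by default. With a log link, those are log-means rather than PPG. A minimal sketch on synthetic data, where x, y, fit, and new_obs are hypothetical names:

```r
# Synthetic stand-in data (hypothetical; not the OHL data set).
set.seed(1)
n <- 200
x <- runif(n)
y <- rgamma(n, shape = 5, rate = 5 / exp(-1 + 2 * x))

fit <- glm(y ~ x, family = Gamma(link = "log"))

new_obs <- data.frame(x = c(0.1, 0.5, 0.9))

# Default type = "link": predictions are on the log scale.
pred_link <- predict(fit, newdata = new_obs)

# type = "response": predictions are on the scale of the response itself.
pred_resp <- predict(fit, newdata = new_obs, type = "response")

# For a log link the two are related by exp().
cbind(link = pred_link, exp_link = exp(pred_link), response = pred_resp)
```

If the response was also shifted before fitting (for example, if ppg_alt is PPG plus a small constant), that shift would need to be undone as well.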
##
## Call:
## glm(formula = ppg_alt ~ position * plyr_quality + treatment *
## drafted + gp_total + plyr_quality * gp_total + plyr_quality *
## drafted + age_continuous * plyr_quality + season, family = gaussian(link = log),
## data = ohl_filtered)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.52455 -0.10400 -0.02565 0.07277 0.79096
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.799200 0.675378 -7.106 5.08e-12 ***
## positionF 0.709110 0.095354 7.437 5.75e-13 ***
## plyr_quality 6.320262 0.855256 7.390 7.86e-13 ***
## treatmentPlayed 0.031932 0.067900 0.470 0.638393
## draftedTRUE 0.464459 0.082106 5.657 2.84e-08 ***
## gp_total 0.013898 0.002486 5.590 4.06e-08 ***
## age_continuous 0.106930 0.036110 2.961 0.003236 **
## season2021-2022 0.495171 0.054993 9.004 < 2e-16 ***
## positionF:plyr_quality -0.975732 0.189273 -5.155 3.89e-07 ***
## treatmentPlayed:draftedTRUE -0.011150 0.089273 -0.125 0.900664
## plyr_quality:gp_total -0.009111 0.003242 -2.810 0.005177 **
## plyr_quality:draftedTRUE -0.447553 0.124028 -3.608 0.000345 ***
## plyr_quality:age_continuous -0.198948 0.043897 -4.532 7.60e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.0376268)
##
## Null deviance: 61.100 on 437 degrees of freedom
## Residual deviance: 15.991 on 425 degrees of freedom
## AIC: -178.87
##
## Number of Fisher Scoring iterations: 7
Takeaways:
Residuals show non-constant variance and are not normally distributed at lower predicted values.
This model captures the relationship better than the previous models.
Predicted PPG is still not on the right scale.
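
The variance and normality checks mentioned above can be sketched with base R alone. The data below are synthetic and all names are hypothetical. The simulated response has a spread that grows with its mean, so the pattern it shows is illustrative and is not the pattern in the OHL residuals.

```r
# Synthetic stand-in data (hypothetical; not the OHL data set).
set.seed(7)
n  <- 300
x  <- runif(n)
mu <- exp(-1 + 2 * x)
y  <- rgamma(n, shape = 20, rate = 20 / mu)   # positive, sd proportional to mu

fit <- glm(y ~ x, family = gaussian(link = "log"))

res      <- residuals(fit, type = "response")  # observed minus fitted
fit_vals <- fitted(fit)

# Residuals vs fitted: a funnel shape suggests non-constant variance.
plot(fit_vals, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot: departures from the line suggest non-normal residuals.
qqnorm(res)
qqline(res)

# Crude numeric check: residual spread below vs above the median fitted value.
low     <- fit_vals <= median(fit_vals)
sd_low  <- sd(res[low])
sd_high <- sd(res[!low])
c(sd_low = sd_low, sd_high = sd_high)
```

Roughly equal spreads in the two halves would be consistent with constant variance. A large gap in either direction points to the kind of problem noted in the takeaways.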
Note: Many teams had the same number of players with a -1 plus-minus.